- This is html2ps 0.1 beta, an HTML-to-PostScript converter.
-
- THIS SOFTWARE IS PROVIDED "AS IS" WITH ABSOLUTELY NO WARRANTY.
-
- The present version of html2ps is written in Perl. Perl is available
- from any comp.sources.misc archive.
-
- The latest version of html2ps is available by anonymous ftp from
- "ftp://ftp.tdb.uu.se/pub/sources/html2ps/". You can either fetch the
- tar file, or the files individually. Use binary mode when retrieving
- the files.
-
- Some PostScript code and some ideas have been taken from the PostScript
- generator in NCSA's Mosaic, by Ameet A. Raval & Frans van Hoesel.
-
- The program has been developed and tested on different Suns running
- Solaris 2.3 with perl version 5.000, and SunOS 4.1.3 with perl version 4.0.
-
- Author:
- Jan Karrman, Dept. of Scientific Computing, Uppsala University, Sweden,
- e-mail jan@tdb.uu.se.
-
-
- Features:
-
- * Most HTML tags are handled; see Notes/Bugs below for exceptions.
-
- * Scaling of the text to any size is possible (the line and page breaks
- will of course be adjusted to fit the page).
-
- * It is possible to change the sizes and styles for all the 6 header
- levels individually.
-
- * The font size used for preformatted text may be changed.
-
- * The size of the page can be adjusted. The defaults are adapted to the
- A4 paper size.
-
- * The margin sizes may be changed.
-
- * Different fonts can be selected. You can easily add new fonts; an
- example is given in the Perl script.
-
- * Printing in landscape mode is supported.
-
- * Anchor texts are underlined by default; this can be turned off.
-
- * No syntax check of the HTML code is done by the converter, but it is
- possible to call an external HTML checker, specified via the command
- line options. The default syntax checker is weblint.
-
- * Page numbers can be inserted.
-
- * A heading tag will cause a page break if the text is close to the end
- of a page.
-
- * Highlighting tags are interpreted additively. For example, the HTML
- code "<B><I>some text</I></B>" would produce bold italic text.
- This can be turned off so only the innermost tag is interpreted
- (here, the italics).
-
- * You can force a page break by including the comment <!--NewPage-->
- in the HTML document, at the point you want the page break. This
- action is not defined in the HTML specification. I would like to
- have a special character (e.g. &page;), ignored by screen browsers,
- but used to force a page break when printing a document.
-
- * The generated PostScript code is very compact: it will be smaller
- than the size of the HTML file plus the size of a PostScript header
- (presently about 8 kilobytes).
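-
- The forced page break described above can also be added mechanically.
- As a minimal sketch (not part of html2ps), the hypothetical shell
- function add_page_breaks below writes <!--NewPage--> on a line of its
- own before every <H1> tag that starts a line, except the first one. It
- assumes that top-level headings begin at the start of a line and that
- a new page at each of them is what you want.
-
```shell
# Hypothetical helper, not shipped with html2ps.
# Reads HTML on standard input and writes it to standard output with a
# forced page break before the second and every later <H1> heading.
add_page_breaks() {
    awk '
        /^<[Hh]1[ >]/ {
            # Skip the first heading so the document does not
            # start with an empty page.
            if (seen) print "<!--NewPage-->"
            seen = 1
        }
        { print }
    '
}

# Usage (file names are illustrative):
#   add_page_breaks < test.html > paged.html
#   html2ps paged.html > test.ps
```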
-
-
- Notes/Bugs:
-
- * In-line images are not handled. -- If the ALT attribute of <IMG> is
- present, the corresponding text is written in place of the image.
- If there is no ALT attribute, the text "[IMAGE]" is written (this
- text can be changed via the command line options).
-
- * The <ISINDEX> tag is not implemented. -- Ignored.
-
- * The <FORM> and associated tags are not implemented. -- Ignored.
-
- * <PRE WIDTH=...> is not implemented. -- Handled as <PRE>.
-
- * The text between two HTML tags or special characters will be converted
- into one PostScript string. If there is a large chunk of text between
- two successive HTML tags (typically between <PRE> and </PRE> tags),
- the length of the PostScript string may exceed some limitations in
- the PostScript printer/viewer. As a workaround, you can insert a few
- HTML comments in the document.
-
- * The string "<PLAINTEXT>" in an HTML document terminates the HTML
- entity, even if it is within an <XMP> or <LISTING> element. The tags
- <XMP>, <LISTING> and <PLAINTEXT> are obsolete; you should use <PRE>
- instead.
-
- * The PostScript code generated by html2ps is not compliant with the
- Document Structuring Conventions. This is because the line breaking
- (and hence the page breaking) is done within the PostScript code
- itself. So the %%Page and %%Pages comments cannot be generated in
- advance.
-
- * A line break may occur at the position of a tag. For example, the code
- "It looks <EM>awful</EM>. Hopefully it will be fixed." may give a line
- break between the word "awful" and the period.
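-
- The workaround for overlong PostScript strings mentioned above (adding
- HTML comments to a large block of text) can be sketched in shell. The
- function name split_long_pre, the comment text and the default of 50
- lines are all assumptions made for illustration. The comment is
- appended to the end of a line rather than placed on its own line, so
- that no extra line appears in the preformatted text. The sketch
- assumes <PRE> and </PRE> each appear at most once per line.
-
```shell
# Hypothetical helper, not shipped with html2ps.
# Appends an HTML comment to every Nth line inside <PRE>...</PRE>
# (N is the first argument, default 50), so that no single run of text
# between two tags grows too long.
split_long_pre() {
    awk -v every="${1:-50}" '
        /<[Pp][Rr][Ee]>/   { in_pre = 1; n = 0 }   # block starts: reset counter
        /<\/[Pp][Rr][Ee]>/ { in_pre = 0 }          # block ends
        in_pre && ++n % every == 0 { $0 = $0 "<!-- split -->" }
        { print }
    '
}

# Usage (file names are illustrative):
#   split_long_pre 40 < test.html > split.html
#   html2ps split.html > test.ps
```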
-
-
- ToDo:
-
- Try to fix as much as possible in the Notes/Bugs section. The lack of
- support for in-line images is perhaps the major deficiency of html2ps.
- It will probably not be implemented in the near future though.
-
-
- Installation (Unix):
-
- I have only tested html2ps on Unix systems; I would like to hear your
- experiences with installing it on other platforms.
-
- Make the Perl script executable (chmod 755 html2ps). To convert an HTML
- file to PostScript, call the script with the HTML file as its parameter.
- Use the -o option to save the PostScript code to a file, or redirect
- the output in the standard Unix manner. For example:
-
- html2ps test.html > test.ps
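-
- To convert several files in one go, the redirect form shown above can
- be wrapped in a loop. This is only a sketch: the function name
- convert_all and the HTML2PS variable are hypothetical, and each output
- file is simply named after its input with ".html" replaced by ".ps".
-
```shell
# Hypothetical helper, not shipped with html2ps.
# Converts each HTML file named on the command line to a .ps file next
# to it. Set HTML2PS to the path of the script if it is not on the PATH.
convert_all() {
    for f in "$@"; do
        "${HTML2PS:-html2ps}" "$f" > "${f%.html}.ps" || return 1
    done
}

# Usage:
#   convert_all *.html
```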
-
- If you want to use the syntax check feature, you have to install some
- HTML syntax check program. The program called by default is weblint
- by Neil Bowers, available via anonymous ftp from ftp.khoros.unm.edu
- in /pub/perl/www.
-
- The file html2ps.1 is a manual page; move it to a directory where
- manual pages are kept, and read it with 'man html2ps'. There is also
- a plaintext version in the file manpage.txt.
-
-
- History:
-
- * 12 Dec 1994: Version 0.1 beta released.
-